# AN EFFICIENT 2D-DISCRETE WAVELET TRANSFORM ARCHITECTURE FOR JPEG 2000 USING FIELD PROGRAMMABLE GATE ARRAYS

## V. Srinivasa Rao\*, Dr Rajesh K Panakala\*\* and Dr Rajesh Kumar. Pullakura\*\*\*

\* Shri Vishnu Engineering College for Women, Bhimavaram, A.P., INDIA. \*\* PVP Siddhartha Institute of Science and Technology, Vijayawada, A.P., INDIA. \*\*\* ECE Dept. Andhra University college of Engineering, Visakhapatnam, A.P., INDIA

**ABSTRACT:** In this paper, a novel architecture for computing the DWT of input images larger than 512 x 512 is designed and implemented on an FPGA. At low bit rates, DWT-based compression offers better subjective image quality than Discrete Cosine Transform (DCT)-based compression, owing to the DWT's inherent scalability and better decorrelation properties. The DWT has traditionally been implemented with convolution or FIR filter bank structures. Such implementations require both a large number of arithmetic computations and a large amount of storage, which makes them unsuitable for high-speed or low-power image and video processing applications. A lifting-based DWT scheme is therefore proposed, which requires far fewer computations than these methods. The proposed architecture is implemented on an XC3S500e-5fg320 FPGA, operates at a maximum frequency of 231.192 MHz, and occupies less than 30% of the CLB resources of the FPGA.

KEYWORDS: Image compression, DWT, JPEG2000, FPGA.

## INTRODUCTION

The discrete wavelet transform (DWT) performs a multi-resolution signal analysis with adjustable locality in both the time and frequency domains. Digital images and videos remain demanding in terms of storage space and transmission bandwidth. Lossy compression is necessary to bring these demands down to a manageable level, but it introduces various types of artifacts, such as blockiness, blur, ringing and noise. Current hardware implementations of JPEG and JPEG 2000 codecs mostly target maximum speed, so as to achieve real-time behavior even when compressing very high-resolution images. For the encoder comparison, we compress a set of 29 test images with two JPEG encoders and three JPEG 2000 encoders at various compression ratios. This paper proposes a JPEG encoder that targets minimal FPGA resource usage without compromising encoded image quality.

## ARCHITECTURE

The main feature of the lifting-based DWT scheme is that it breaks up the high-pass and low-pass filters into a sequence of upper and lower triangular matrices, converting the filter implementation into banded matrix multiplications. The proposed architecture computes the multi-level DWT, for both the forward and the inverse transforms, one level at a time in a row-column fashion.

#### **Discrete Wavelet Transform**

The discrete wavelet transform (DWT) transforms a discrete-time signal into a discrete wavelet representation. It converts an input series $x_0, x_1, \ldots, x_m$ into one high-pass wavelet coefficient series and one low-pass wavelet coefficient series.



Figure 1. Discrete wavelet transform
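
As a point of reference for the conventional approach, the following is a minimal sketch of a direct two-channel analysis filter bank: each output series is obtained by FIR filtering followed by keeping every other sample. It assumes the LeGall 5/3 analysis taps and whole-sample symmetric extension at the borders; neither choice is stated in this section, and the function names are hypothetical.

```python
# Direct (non-lifting) analysis filter bank: FIR filter, then downsample by 2.
# Assumption: LeGall 5/3 analysis taps, whole-sample symmetric border extension.

LOW_TAPS = [-1 / 8, 1 / 4, 3 / 4, 1 / 4, -1 / 8]   # centred on even samples
HIGH_TAPS = [-1 / 2, 1.0, -1 / 2]                  # centred on odd samples


def reflect(i, n):
    """Map an out-of-range index into [0, n) by symmetric extension."""
    if i < 0:
        i = -i
    if i >= n:
        i = 2 * (n - 1) - i
    return i


def analysis_filter_bank(x):
    """Return (low, high) coefficient lists for an even-length sequence x."""
    n = len(x)
    low, high = [], []
    for m in range(n // 2):
        # Five multiply-accumulates per low-pass coefficient.
        low.append(sum(t * x[reflect(2 * m + k - 2, n)]
                       for k, t in enumerate(LOW_TAPS)))
        # Three multiply-accumulates per high-pass coefficient.
        high.append(sum(t * x[reflect(2 * m + k, n)]
                        for k, t in enumerate(HIGH_TAPS)))
    return low, high


x = [154, 123, 136, 192, 123, 180, 154, 110]
low, high = analysis_filter_bank(x)
print(low, high)
```

In this form every pair of output coefficients costs eight multiply-accumulate operations, which is the overhead a lifting factorization is meant to reduce.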

## **2D DWT Implementation**

Let $\tilde{h}(z)$ and $\tilde{g}(z)$ be the low-pass and high-pass analysis filters, and let $h(z)$ and $g(z)$ be the low-pass and high-pass synthesis filters. The 2D DWT is then implemented as follows:



Figure 2. DWT Implementation



Figure 3. Architecture of 2-D (5, 3) Discrete Wavelet Transform

The decomposition in the 2D DWT is shown in Figure 4.



Figure 4. Decomposition in 2D DWT
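
The row-column decomposition can be sketched as follows: a 1D transform is applied to every row, and then to every column of each resulting half, which yields four subbands. This is an illustrative sketch, not the authors' hardware schedule. It assumes the integer 5/3 lifting filter with symmetric border extension, and the subband naming (HL meaning high-pass horizontally, low-pass vertically) is an assumed convention; `lift53` and `dwt2_level` are hypothetical names.

```python
def lift53(x):
    """One level of integer 5/3 lifting on an even-length list: (low, high)."""
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    high = [odd[i] - ((even[i] + even[min(i + 1, n - 1)]) >> 1)
            for i in range(n)]
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
           for i in range(n)]
    return low, high


def column_pass(block):
    """Apply lift53 to every column of a 2D list; return (low, high) blocks."""
    columns = [list(c) for c in zip(*block)]            # transpose
    pairs = [lift53(c) for c in columns]
    low = [list(r) for r in zip(*[p[0] for p in pairs])]   # transpose back
    high = [list(r) for r in zip(*[p[1] for p in pairs])]
    return low, high


def dwt2_level(image):
    """One decomposition level: rows first, then columns. Returns LL, HL, LH, HH."""
    rows_low, rows_high = [], []
    for row in image:
        lo, hi = lift53(row)
        rows_low.append(lo)
        rows_high.append(hi)
    ll, lh = column_pass(rows_low)    # horizontally low-pass half
    hl, hh = column_pass(rows_high)   # horizontally high-pass half
    return ll, hl, lh, hh


flat = [[7] * 4 for _ in range(4)]
ll, hl, lh, hh = dwt2_level(flat)
print(ll, hl, lh, hh)
```

For a multi-level transform, the same function would be applied again to the LL subband, one level at a time.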

Then the corresponding poly-phase matrices are defined as:

$$\tilde{P}(z) = \begin{bmatrix} \tilde{h}_e(z) & \tilde{h}_o(z) \\ \tilde{g}_e(z) & \tilde{g}_o(z) \end{bmatrix}$$

$$P(z) = \begin{bmatrix} h_e(z) & g_e(z) \\ h_o(z) & g_o(z) \end{bmatrix}$$

When the determinant of $P(z)$ is unity, the synthesis filter pair $(h, g)$ and the analysis filter pair $(\tilde{h}, \tilde{g})$ are both complementary.

$$P(z) = \begin{bmatrix} h_e(z) & g_e(z) \\ h_o(z) & g_o(z) \end{bmatrix} = \prod_{i=1}^m \underbrace{\begin{bmatrix} 1 & s_i(z) \\ 0 & 1 \end{bmatrix}}_{\text{primal lifting}} \underbrace{\begin{bmatrix} 1 & 0 \\ t_i(z) & 1 \end{bmatrix}}_{\text{dual lifting}} \underbrace{\begin{bmatrix} K & 0 \\ 0 & 1/K \end{bmatrix}}_{\text{normalization}}$$
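
As a quick consistency check on this factorization, the determinant of a product is the product of the determinants, and every factor has unit determinant whatever the lifting filters $s_i(z)$ and $t_i(z)$ are:

$$\det P(z) = \left( \prod_{i=1}^m \det\begin{bmatrix} 1 & s_i(z) \\ 0 & 1 \end{bmatrix} \det\begin{bmatrix} 1 & 0 \\ t_i(z) & 1 \end{bmatrix} \right) \det\begin{bmatrix} K & 0 \\ 0 & 1/K \end{bmatrix} = \left( \prod_{i=1}^m 1 \cdot 1 \right) \cdot K \cdot \frac{1}{K} = 1$$

Any filter pair built from such a product therefore satisfies the unit-determinant condition automatically.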

#### **IMPLEMENTATION**

The two types of lifting schemes are shown in Figure 5. Scheme 1, which corresponds to the factorization of $P(z)$, consists of three steps. In the predict step, the even samples are multiplied by the time-domain equivalent of $t(z)$ and added to the odd samples. In the update step, the updated odd samples are multiplied by the time-domain equivalent of $s(z)$ and added to the even samples. In the scaling step, the even samples are multiplied by $1/K$ and the odd samples by $K$.
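
The three steps can be sketched in software as follows. This is a minimal sketch that assumes the reversible integer 5/3 filter used in JPEG 2000 (suggested by the "(5, 3)" label in the caption of Figure 3), for which the predict weights are -1/2, the update weights are +1/4 and $K = 1$. The symmetric border handling and the function name are assumptions, and the code is not taken from the authors' HDL. The input samples are the even and odd values listed with Figure 6.

```python
def forward_lifting_53(x):
    """One level of reversible 5/3 lifting on an even-length integer list.

    Returns (low, high): the low-pass and high-pass coefficient lists.
    """
    even, odd = list(x[0::2]), list(x[1::2])   # split into polyphase parts
    n = len(odd)

    # Predict: subtract the floored mean of the two neighbouring even
    # samples from each odd sample (t = -1/2 on both neighbours).
    high = []
    for i in range(n):
        right = even[i + 1] if i + 1 < n else even[i]   # border: repeat last
        high.append(odd[i] - ((even[i] + right) >> 1))

    # Update: add a rounded quarter of the two neighbouring high-pass
    # samples to each even sample (s = +1/4 on both neighbours).
    low = []
    for i in range(n):
        left = high[i - 1] if i > 0 else high[0]        # border: repeat first
        low.append(even[i] + ((left + high[i] + 2) >> 2))

    # Scaling: K = 1 for this filter, so no multiplication is needed.
    return low, high


even = [154, 136, 123, 154, 254, 167, 166, 123]
odd = [123, 192, 180, 110, 239, 149, 198, 180]
x = [v for pair in zip(even, odd) for v in pair]   # interleave even/odd
low, high = forward_lifting_53(x)
print(high)   # [-22, 63, 42, -94, 29, -17, 54, 57]
```

Each coefficient pair needs only additions and two shifts here, with no multipliers.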

#### Lifting Scheme



Figure 5. Lifting Scheme

Fifth International Conference on Recent Trends in Information Processing & Computing, IPC-2016

The inverse DWT is obtained by traversing the scheme in the reverse direction, changing the factor $K$ to $1/K$ and reversing the signs of the coefficients of $t(z)$ and $s(z)$. The basic idea of lifting is the following: if a pair of filters $(h, g)$ is complementary, that is, it allows perfect reconstruction, then for every filter $s$ the pair $(h', g)$ with $h'(z) = h(z) + s(z^2)\,g(z)$ also allows perfect reconstruction.
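
The reversal can be illustrated with a small generic sketch: the inverse applies the same predict and update functions in the opposite order with opposite signs, so reconstruction is exact regardless of which functions are used. The helper names are hypothetical, and the 5/3 predict and update functions with symmetric border handling are assumptions; the scaling step is omitted because $K = 1$ in that case.

```python
def lift_forward(x, predict, update):
    """Split x into even/odd halves, then apply one predict and one update step."""
    even, odd = list(x[0::2]), list(x[1::2])
    high = [o - p for o, p in zip(odd, predict(even))]   # odd  -= P(even)
    low = [e + u for e, u in zip(even, update(high))]    # even += U(high)
    return low, high


def lift_inverse(low, high, predict, update):
    """Undo the steps in reverse order with the signs flipped, then merge."""
    even = [l - u for l, u in zip(low, update(high))]    # even -= U(high)
    odd = [h + p for h, p in zip(high, predict(even))]   # odd  += P(even)
    x = [0] * (len(even) + len(odd))
    x[0::2], x[1::2] = even, odd
    return x


def predict_53(even):
    # floor((even[i] + even[i+1]) / 2), repeating the last sample at the border
    return [(e + nxt) >> 1 for e, nxt in zip(even, even[1:] + even[-1:])]


def update_53(high):
    # floor((high[i-1] + high[i] + 2) / 4), repeating the first sample at the border
    return [(prev + h + 2) >> 2 for prev, h in zip(high[:1] + high[:-1], high)]


x = [180, 198, 123, 180, 154, 123, 136, 192]
low, high = lift_forward(x, predict_53, update_53)
assert lift_inverse(low, high, predict_53, update_53) == x
```

Because the inverse recomputes exactly the quantities that the forward pass added or subtracted, the rounding inside the predict and update functions does not affect reconstruction.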

# **Device Utilization Summary of DWT Block**

| top_idwt Project Status (04/02/2013 - 14:43:18) |                           |                       |                    |
|-------------------------------------------------|---------------------------|-----------------------|--------------------|
| Project File:                                   | idwtnew.xise              | Parser Errors:        | No Errors          |
| Module Name:                                    | top_idwt                  | Implementation State: | Synthesized        |
| Target Device:                                  | xc3s500e-5fg320           | Errors:               | No Errors          |
| Product Version:                                | ISE 12.2                  | Warnings:             | 3 Warnings (3 new) |
| Design Goal:                                    | Balanced                  | Routing Results:      |                    |
| Design Strategy:                                | Xilinx Default (unlocked) | Timing Constraints:   |                    |
| Environment:                                    | System Settings           | Final Timing Score:   |                    |

| Device Utilization Summary (estimated values) |      |           |             |
|-----------------------------------------------|------|-----------|-------------|
| Logic Utilization                             | Used | Available | Utilization |
| Number of Slices                              | 769  | 4656      | 16%         |
| Number of Slice Flip Flops                    | 909  | 9312      | 9%          |
| Number of 4 input LUTs                        | 1477 | 9312      | 15%         |
| Number of bonded IOBs                         | 109  | 232       | 46%         |
| Number of GCLKs                               | 1    | 24        | 4%          |
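
The utilization column can be reproduced from the used and available counts. The short sketch below assumes that the report truncates each percentage to a whole number; that rounding rule is an inference from the figures in the table, and the names are hypothetical.

```python
# Used / available counts copied from the Device Utilization Summary table.
RESOURCES = {
    "Slices": (769, 4656),
    "Slice Flip Flops": (909, 9312),
    "4 input LUTs": (1477, 9312),
    "Bonded IOBs": (109, 232),
    "GCLKs": (1, 24),
}


def utilization_percent(used, available):
    """Whole-number percentage, truncated (assumed report rounding)."""
    return 100 * used // available


for name, (used, available) in RESOURCES.items():
    print(f"{name}: {utilization_percent(used, available)}%")
```

With truncation the computed values agree with the table, including the 16% slice utilization.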

| Detailed Reports   |         |                         |        |                    |                   |
|--------------------|---------|-------------------------|--------|--------------------|-------------------|
| Report Name        | Status  | Generated               | Errors | Warnings           | Infos             |
| Synthesis Report   | Current | Tue Apr 2 14:43:16 2013 | 0      | 3 Warnings (3 new) | 60 Infos (60 new) |
| Translation Report |         |                         |        |                    |                   |

# RESULTS

#### **Simulation Results**

| Signal                              | Value               | Waveform values (left to right)                   |
|-------------------------------------|---------------------|---------------------------------------------------|
| Inputs                              |                     |                                                   |
| /top_tb_v/clk                       | 0                   |                                                   |
| /top_tb_v/rst                       | 0                   |                                                   |
| /top_tb_v/data_in                   | 180                 | 198, 123, 180                                     |
| DWT_1                               |                     |                                                   |
| /top_tb_v/uut/e1/even               | 154 136 123 154 ... | 154 136 123 154 254 167 166 123                   |
| /top_tb_v/uut/e1/odd                | 123 192 180 110 ... | 123 192 180 110 239 149 198 180                   |
| /top_tb_v/uut/e1/load               | 1                   |                                                   |
| /top_tb_v/uut/e1/temp_even          | 123                 | 0, 154, 136, 123, 154, 254, 167, 166, 123         |
| /top_tb_v/uut/e1/m2_out             | 123                 | 0, 154, 136, 123, 154, 254, 167, 166, 123         |
| /top_tb_v/uut/e1/m3_in              | 246                 | 0, 154, 290, 259, 277, 408, 421, 333, 289, 246    |
| /top_tb_v/uut/e1/m3_in_temp         | 123                 | 0, 77, 145, 129, 138, 204, 210, 166, 144, 123     |
| /top_tb_v/uut/e1/m4_out             | -123                | 0, -77, -145, -129, -138, -204, -210, -166, -144, -123 |
| /top_tb_v/uut/e1/temp_odd           | 180                 | 0, 123, 192, 180, 110, 239, 149, 198, 180         |
| /top_tb_v/uut/e1/m5_out             | 180                 | 0, 123, 192, 180, 110, 239, 149, 198, 180         |
| /top_tb_v/uut/e1/m5_odd_out_top1    | 57                  | 0, -77, -22, 63, 42, -94, 29, -17, 54, 57         |
| /top_tb_v/uut/e1/m7_out             | 57                  | 0, -77, -22, 63, 42, -94, 29, -17, 54, 57         |
| /top_tb_v/uut/e1/m8_in              | 114                 | 0, -77, -99, 41, 105, -52, -65, 12, 37, 111, 114  |
| /top_tb_v/uut/e1/m9_out             | 114                 | 0, -77, -99, 41, 105, -52, -65, 12, 37, 111, 114  |
| /top_tb_v/uut/e1/m2_temp            | 123                 | 0, 154, 136, 123, 154, 254, 167, 166, 123         |
| /top_tb_v/uut/e1/m10_even_out_down1 | 237                 | 0, -77, -99, 195, 241, 71, 89, 266, 204, 277, 237 |

Figure 6. Simulation Result of DWT-1 Block with Both High and Low Pass coefficients
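
The high-pass data path behind these values can be modelled cycle by cycle. The sketch below is inferred purely from the listed signal values and is not the authors' HDL: it assumes that each even sample is added to the previous one, halved with an arithmetic shift, negated, and then added to the previous odd sample, with all registers at 0 after reset and the last sample held for one extra cycle. The dictionary keys reuse the signal names from the trace; the function name is hypothetical.

```python
def dwt1_trace_model(even, odd):
    """Cycle-by-cycle model of the DWT-1 high-pass path (inferred from the trace)."""
    # Assumption: the last sample is held for one extra clock cycle.
    e_stream = even + even[-1:]
    o_stream = odd + odd[-1:]
    prev_even = prev_odd = 0            # assumed register contents after reset
    trace = []
    for e, o in zip(e_stream, o_stream):
        m3_in = e + prev_even           # sum of two neighbouring even samples
        m3_in_temp = m3_in >> 1         # halve with an arithmetic right shift
        m4_out = -m3_in_temp            # negate
        top1 = m4_out + prev_odd        # add the delayed odd sample
        trace.append({"m3_in": m3_in, "m3_in_temp": m3_in_temp,
                      "m4_out": m4_out, "m5_odd_out_top1": top1})
        prev_even, prev_odd = e, o
    return trace


even = [154, 136, 123, 154, 254, 167, 166, 123]
odd = [123, 192, 180, 110, 239, 149, 198, 180]
for cycle in dwt1_trace_model(even, odd):
    print(cycle)
```

Under these assumptions the first output, -77, is a start-up value produced while the delay registers still hold 0, and the eight values that follow are the high-pass coefficients.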

| Signal                              | Value         | Waveform values (left to right) |
|-------------------------------------|---------------|---------------------------------|
| DWT_2                               |               |                                 |
| /top_tb_v/start                     |               |                                 |
| /top_tb_v/uut/e2/data_in            | 57            | 42, -94, 29, -17, 54, 57        |
| /top_tb_v/uut/e2/even1              | -22 42 29 54  | -22 42 29 54                    |
| /top_tb_v/uut/e2/odd1               | 63 -94 -17 57 | 63 -94 -17 57                   |
| /top_tb_v/uut/e2/load               |               |                                 |
| /top_tb_v/uut/e2/temp_odd_even      | 54            | 0, -22, 42, 29, 54              |
| /top_tb_v/uut/e2/m2_out             | 54            | 0, -22, 42, 29, 54              |
| /top_tb_v/uut/e2/m3_in              | 108           | 0, -22, 20, 71, 83, 108         |
| /top_tb_v/uut/e2/m3_in_temp         | 54            | 0, -11, 10, 35, 41, 54          |
| /top_tb_v/uut/e2/m4_out             | -54           | 0, 11, -10, -35, -41, -54       |
| (signal name not legible)           | 57            | 0, 63, -94, -17, 57             |
| /top_tb_v/uut/e2/m5_out             | 57            | 0, 63, -94, -17, 57             |
| /top_tb_v/uut/e2/m5_odd_out_top1    |               | 0, 11, 53, -129, -58, 3         |
| /top_tb_v/uut/e2/m7_out             |               | 0, 11, 53, -129, -58, 3         |
| /top_tb_v/uut/e2/m8_in              |               | 0, 11, 64, -76, -187, -55, 6    |
| /top_tb_v/uut/e2/m9_out             |               | 0, 11, 64, -76, -187, -55, 6    |
| (signal name not legible)           | 54            | 0, -22, 42, 29, 54              |
| /top_tb_v/m5_odd_even_out_top1      | 60            | 0, 11, 64, -98, -145, -26, 60   |

Figure 7. Simulation Result of DWT-2 Block with Both High and Low Pass Coefficients



Figure 8. Simulation Result of IDWT-2 Block



Figure 9. Simulation Result of IDWT-1 Block with Final Pixel Values

## CONCLUSION

Wavelet-based coding provides a substantial improvement in picture quality at low bit rates because of the overlapping basis functions and better energy compaction of wavelet transforms. Wavelet-based coders also facilitate progressive transmission of images, thereby allowing variable bit rates. Medical images in particular require high accuracy without loss of information. The discrete wavelet transform (DWT) is based on a time-scale representation, which provides efficient multi-resolution analysis. The analysis shows that the discrete wavelet transform (DWT) and the inverse discrete wavelet transform (IDWT) operate at maximum clock frequencies of 113.841 MHz and 80.290 MHz, respectively.
